feat: competitive review pipeline improvements by haasonsaas · Pull Request #33 · evalops/diffscope

haasonsaas · 2026-03-11T15:38:25Z

Summary

File triage (File triage: classify files before expensive review #29): Heuristic pre-filter skips lock files, generated code, deletion-only, whitespace-only, and comment-only changes before expensive LLM calls
Dynamic context boundaries (Dynamic context: enclosing function/class boundary detection #25): Context expansion now snaps to enclosing function boundaries instead of fixed-window, with asymmetric padding (5 before / 1 after)
Robust LLM parsing (Robust LLM output parsing with fallback strategies #28): 5-strategy fallback chain (primary → numbered list → markdown bullets → file:line → JSON) so varied LLM output formats are handled gracefully
Verification pass (Verification pass to catch hallucinations #23): Second LLM call validates findings against actual code, scoring 0–10 with auto-zero for noise categories (docstrings, type hints, import ordering)
Multi-model routing (Multi-model routing: triage with cheap models, review with frontier #26): ModelRole enum (Primary/Weak/Reasoning/Embedding) with per-role model config and fallback chain
Adaptive patch compression (Adaptive patch compression for large PRs #30): 4-stage progressive degradation (Full → Compressed → Clipped → MultiCall) for large PRs that exceed token budgets

10 files changed, +2485 lines, 114 new tests (716 total, all passing).

Test plan

All 716 tests pass (cargo test)
Triage correctly skips lock files, generated files, deletion/whitespace/comment-only diffs (45 tests)
Function boundary detection works across Rust, Python, JS, Go, Java (5 tests)
All 5 LLM response parsing strategies produce correct output (11 tests)
Verification auto-zeros noise, parses scoring format, clamps scores (17 tests)
ModelRole falls back to primary when role-specific model not configured (12 tests)
Compression selects correct strategy at each stage, respects max_calls, accounts for all file indices (24 tests)
Manual: run a review on a large PR to verify compression logging and skipped-file summary

🤖 Generated with Claude Code

…sing Phase 1 improvements to the review pipeline, informed by competitive analysis of Greptile, CodeRabbit, and Qodo Merge architectures: - File triage (#29): Heuristic classifier skips lock files, generated files, deletion-only, whitespace-only, and comment-only changes before expensive LLM review. 45 unit tests. - Dynamic context boundaries (#25): Context fetcher now searches upward for enclosing function/class definitions instead of using fixed-window context. Uses asymmetric padding (more before, less after) matching Qodo's approach. 6 unit tests. - Robust LLM output parsing (#28): 5-strategy fallback chain for parsing LLM responses — primary format, numbered lists, markdown bullets, file:line format, and JSON extraction. Prevents silent comment loss when models deviate from the expected format. 11 unit tests. - Fix rust-embed v8 derive macro (Embed -> RustEmbed) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Verification pass (#23): LLM validates review findings against actual code, scoring 0-10 with auto-zero for noise (docstrings, type hints, imports). Filters hallucinated comments before output. 17 new tests. - Multi-model routing (#26): ModelRole enum (Primary/Weak/Reasoning/Embedding) with per-role model config and fallback chain. 12 new tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

4-stage progressive degradation strategy for diffs exceeding token budget: Stage 1 (Full): all diffs fit → use as-is Stage 2 (Compressed): remove deletion-only hunks, drop tail files Stage 3 (Clipped): truncate large files at hunk boundaries Stage 4 (MultiCall): split into multiple batches (up to 5) Integrated into review pipeline — files dropped by compression are skipped with logging. 24 new tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Prevent compression analysis from skipping files in per-file review mode, make verification diff truncation UTF-8 safe, and tighten comment-only triage to avoid dereference false positives while improving compression packing order. Made-with: Cursor

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Instead of failing the workflow, emit a notice and skip the review and comment-posting steps when OPENAI_API_KEY is not set. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…ding default - `#` comment prefix: exclude Rust attributes (#[...], #![...]), C preprocessor directives (#include, #define, etc.) from comment detection - `generated/` match: use boundary-aware checks (starts_with or /generated/) to avoid false positives on paths like `user_generated/` - Embedding model default: fall back to primary model instead of hardcoding OpenAI's text-embedding-3-small, respecting self-hosted provider support - 4 new tests for the triage fixes Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

…errors Use the Weak model role for the verification pass — it's a validation task that doesn't need the frontier model. This also resolves the clippy dead_code errors for ModelRole, model_for_role, and to_model_config_for_role by giving them their first real callsite. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

devin-ai-integration

Devin Review found 4 new potential issues.

View 12 additional findings in Devin Review.

devin-ai-integration · 2026-03-11T18:15:46Z

src/review/triage.rs

+    // 3. Deletion-only — all non-context lines are Removed
+    if all_changes
+        .iter()
+        .all(|c| matches!(c.change_type, ChangeType::Removed))
+    {
+        return TriageResult::SkipDeletionOnly;
+    }


🔴 Triage SkipDeletionOnly silently skips review of security-relevant code removals

The new triage system classifies any diff where all non-context changes are Removed as SkipDeletionOnly, causing these files to be silently skipped from LLM review (src/review/pipeline.rs:311-320). This means removal of security checks, validation logic, error handling, authentication guards, or rate-limiting code would never be reviewed.

Example of dangerous deletion-only diff that would be skipped

If a developer removes an authentication check:

- if !user.is_authorized() { - return Err("Unauthorized"); - }

This is a deletion-only change. The triage at triage.rs:90-94 classifies it as SkipDeletionOnly, and the pipeline at pipeline.rs:312-320 skips the file entirely — no LLM review occurs for this security regression.

Deletion-only changes are a well-known category of security-relevant refactors. Unlike the pre-existing diff.is_deleted check (which skips fully deleted files — a reasonable heuristic), SkipDeletionOnly applies to files that still exist but had code removed, which is exactly the pattern for accidentally removing safeguards.

Prompt for agents

In src/review/triage.rs, remove the SkipDeletionOnly classification entirely, or make it configurable. The deletion-only check at lines 89-95 in triage_diff() should be removed because deletion-only changes can represent security-relevant code removals (e.g., removing authentication checks, validation, error handling). If kept at all, it should be gated behind a config flag like triage_skip_deletions: bool (default false). Also update the TriageResult enum, should_skip(), and reason() to remove or gate the variant. In src/review/pipeline.rs around line 311-320, the triage skip logic is fine as-is — it just needs the underlying triage heuristic to be less aggressive.

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-03-11T18:15:47Z

src/review/triage.rs

+];
+
+/// Comment line prefixes (after trimming leading whitespace).
+const COMMENT_PREFIXES: &[&str] = &["//", "# ", "/*", "*/", "* ", "--", "<!--", "\"\"\"", "'''"];


🟡 COMMENT_PREFIXES includes "* " which false-positives on Markdown list items and other non-comment content

The is_comment_line function at src/review/triage.rs:176-193 uses COMMENT_PREFIXES to detect comment-only changes, but it's file-type-agnostic. The prefix "* " (intended for C/Java/Rust doc-comment continuation lines like * @param x) also matches Markdown list items (* Fix the build process). A diff that only modifies Markdown list items, horizontal rules (--), or headings (# ) in .md/.sql/other files would be classified as SkipCommentOnly and silently skipped from review. For example, changes to README, CHANGELOG, or API documentation that only modify list items would never be reviewed.

Prompt for agents

In src/review/triage.rs, the is_comment_line function (line 176) should either: (1) accept the file extension as a parameter so it can apply language-specific comment prefix lists, or (2) the triage_diff function should exclude known non-code file extensions (.md, .txt, .rst, .adoc) from the SkipCommentOnly classification. The COMMENT_PREFIXES list at line 44 is correct for code files but causes false positives for Markdown/documentation files where '* ' is a list item, '# ' is a heading, and '--' is a horizontal rule.

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-03-11T18:15:49Z

.github/workflows/diffscope.yml

+          if [ -z "${{ secrets.OPENAI_API_KEY }}" ]; then
+            echo "skip=true" >> "$GITHUB_OUTPUT"
+            echo "::notice::DiffScope review skipped — OPENAI_API_KEY secret not configured"
+          else
+            echo "skip=false" >> "$GITHUB_OUTPUT"
+          fi


🔴 GitHub Actions secret directly interpolated into shell command allows injection

At line 33, ${{ secrets.OPENAI_API_KEY }} is expanded by GitHub Actions' template engine before the shell runs, meaning the raw secret value is pasted directly into the if [ -z "..." ] command. If the secret contains shell-special characters (e.g., ", `, $(...), ;, |), it can break the script or, worse, execute arbitrary commands.

Safe pattern using environment variable

The standard fix is to pass the secret through an environment variable:

- name: Check API key id: check_key env: KEY: ${{ secrets.OPENAI_API_KEY }} run: | if [ -z "$KEY" ]; then echo "skip=true" >> "$GITHUB_OUTPUT" else echo "skip=false" >> "$GITHUB_OUTPUT" fi

This way the value is never interpolated into the script text itself.

Suggested change

if [ -z "${{ secrets.OPENAI_API_KEY }}" ]; then

echo "skip=true" >> "$GITHUB_OUTPUT"

echo "::notice::DiffScope review skipped — OPENAI_API_KEY secret not configured"

else

echo "skip=false" >> "$GITHUB_OUTPUT"

fi

if [ -z "$OPENAI_KEY" ]; then

echo "skip=true" >> "$GITHUB_OUTPUT"

echo "::notice::DiffScope review skipped — OPENAI_API_KEY secret not configured"

else

echo "skip=false" >> "$GITHUB_OUTPUT"

fi

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration · 2026-03-11T18:15:51Z

src/core/function_chunker.rs

+pub fn find_enclosing_boundary_line(
+    content: &str,
+    file_path: &Path,
+    line: usize,
+    _max_search_lines: usize,
+) -> Option<usize> {
+    let language = detect_language(file_path);
+    let boundaries = detect_function_boundaries(content, &language);
+    // Find the innermost function containing this line
+    find_enclosing_function(&boundaries, line).map(|b| b.start_line)
+}


🟡 find_enclosing_boundary_line ignores max_search_lines parameter, violating its documented contract

The function's docstring at src/core/function_chunker.rs:285-287 states it "Searches upward from line up to max_search_lines", but the implementation at line 292 prefixes the parameter with _ and completely ignores it. Instead it scans all function boundaries in the entire file. The caller at src/core/context.rs:56-57 passes 10 and then applies its own post-hoc filter (boundary >= start.saturating_sub(10)) to compensate, but if anyone else calls find_enclosing_boundary_line trusting the documented parameter semantics, the limit won't be respected.

Prompt for agents

In src/core/function_chunker.rs at line 292, the _max_search_lines parameter is unused. Either implement the documented behavior (filter boundaries to only those within max_search_lines above the given line), or remove the parameter from the function signature and update the docstring. If you keep it, the implementation should filter like: find_enclosing_function(&boundaries, line).filter(|b| line - b.start_line <= max_search_lines).map(|b| b.start_line).

Was this helpful? React with 👍 or 👎 to provide feedback.

haasonsaas and others added 3 commits March 11, 2026 08:25

This comment was marked as resolved.

Sign in to view

haasonsaas and others added 4 commits March 11, 2026 10:56

fix: rustfmt formatting and bump quinn-proto 0.11.14 (RUSTSEC-2026-0037)

27490eb

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

fix: skip DiffScope self-review when API key not configured

a05f873

Instead of failing the workflow, emit a notice and skip the review and comment-posting steps when OPENAI_API_KEY is not set. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

devin-ai-integration bot reviewed Mar 11, 2026

View reviewed changes

haasonsaas merged commit 1b62f94 into main Mar 11, 2026
6 checks passed

haasonsaas deleted the feat/phase1-triage-context-parsing branch March 11, 2026 18:51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: competitive review pipeline improvements#33

feat: competitive review pipeline improvements#33
haasonsaas merged 8 commits intomainfrom
feat/phase1-triage-context-parsing

haasonsaas commented Mar 11, 2026 •

edited by devin-ai-integration bot

Loading

Uh oh!

This comment was marked as resolved.

Uh oh!

This comment was marked as resolved.

Uh oh!

devin-ai-integration bot left a comment

Uh oh!

devin-ai-integration bot Mar 11, 2026

Uh oh!

devin-ai-integration bot Mar 11, 2026

Uh oh!

devin-ai-integration bot Mar 11, 2026

Uh oh!

devin-ai-integration bot Mar 11, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

haasonsaas commented Mar 11, 2026 • edited by devin-ai-integration bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Uh oh!

This comment was marked as resolved.

Uh oh!

This comment was marked as resolved.

Uh oh!

devin-ai-integration bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration bot Mar 11, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

haasonsaas commented Mar 11, 2026 •

edited by devin-ai-integration bot

Loading